Imitating Mistakes in a Learning Companion AI Agent for Online Peer Learning

Moribe, Sosui, Ushiama, Taketoshi

arXiv.org Artificial Intelligence

In recent years, peer learning has gained attention as a method that promotes spontaneous thinking among learners, and its effectiveness has been confirmed by numerous studies. This study aims to develop an AI agent as a learning companion that enables peer learning anytime and anywhere. However, peer learning between humans has various limitations and is not always effective; effective peer learning requires companions at the same proficiency level. In this study, we assume that a peer at the same proficiency level as the learner makes the same mistakes the learner does, and we focus on English composition as a specific example to validate this approach.


Aligned Textual Scoring Rules

Lu, Yuxuan, Wu, Yifan, Hartline, Jason, Curry, Michael J.

arXiv.org Artificial Intelligence

Scoring rules elicit probabilistic predictions from a strategic agent by scoring the prediction against a ground-truth state. A scoring rule is proper if, from the agent's perspective, reporting the true belief maximizes the expected score. With the development of language models, Wu and Hartline (2024) propose a reduction from textual information elicitation to the numerical (i.e., probabilistic) information elicitation problem, which achieves provable properness for textual elicitation. However, not all proper scoring rules are well aligned with human preferences over text. Our paper designs the Aligned Scoring Rule (ASR) for text by minimizing the mean squared error between a proper scoring rule and a reference score (e.g., a human score). Our experiments show that ASR outperforms previous methods in aligning with human preference while maintaining properness.
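One way to make this concrete: positive affine transforms a·S + b of a proper scoring rule S remain proper when a > 0, so the simplest alignment within the proper family is a least-squares fit of (a, b) against reference scores. This is a minimal illustrative sketch of that idea, not the paper's actual ASR construction; the function name and the clipping constant are assumptions.

```python
import numpy as np

def align_affine(proper_scores, reference_scores):
    """Fit y ~ a*S + b by least squares; a > 0 keeps the rule proper."""
    S = np.asarray(proper_scores, dtype=float)
    y = np.asarray(reference_scores, dtype=float)
    A = np.stack([S, np.ones_like(S)], axis=1)  # design matrix [S, 1]
    (a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
    a = max(a, 1e-6)  # clip so the transform stays properness-preserving
    return a, b

# Toy example: proper scores on [0, 1] vs. human scores on a 1-5 scale.
a, b = align_affine([0.1, 0.5, 0.9], [1.0, 3.0, 5.0])
```

Because the toy points are exactly collinear, the fit recovers the line y = 5·S + 0.5 exactly; on real data the residual is the mean-squared misalignment being minimized.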


Peer Review as Structured Commentary: Immutable Identity, Public Dialogue, and Reproducible Scholarship

Wright, Craig Steven

arXiv.org Artificial Intelligence

This paper reconceptualises peer review as structured public commentary. Traditional academic validation is hindered by anonymity, latency, and gatekeeping. We propose a transparent, identity-linked, and reproducible system of scholarly evaluation anchored in open commentary. Leveraging blockchain for immutable audit trails and AI for iterative synthesis, we design a framework that incentivises intellectual contribution, captures epistemic evolution, and enables traceable reputational dynamics. This model empowers fields from computational science to the humanities, reframing academic knowledge as a living process rather than a static credential.
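The "immutable audit trail" idea can be illustrated without a full blockchain: a hash chain in which each commentary entry commits to the hash of its predecessor, so editing any earlier entry invalidates every later link. This is an illustrative stand-in for the anchoring the paper describes; the field names and identifier scheme are hypothetical.

```python
import hashlib
import json

def append_entry(chain, author_id, comment):
    """Append an identity-linked comment whose hash commits to the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64  # genesis sentinel
    entry = {"author": author_id, "comment": comment, "prev": prev_hash}
    payload = json.dumps(entry, sort_keys=True).encode()
    entry["hash"] = hashlib.sha256(payload).hexdigest()
    chain.append(entry)
    return chain

chain = []
append_entry(chain, "reviewer:alice", "Section 3 lacks a control condition.")
append_entry(chain, "author:bob", "Added a control condition in revision 2.")
# Tampering with the first comment would change its hash and break the
# "prev" link stored in every subsequent entry.
```

A public ledger adds decentralised replication on top of this structure, but the traceability property itself comes from the hash linking.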


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

We thank the reviewers for their thoughtful comments about the paper. Specific replies, where warranted, follow. Reviewer 1: We wanted to thank the reviewer for the additional relevant references. We will do a better job of proofreading the paper. Because the arguments were not straightforward, we felt it was worthwhile to sacrifice some exposition to make sure all arguments were exposed to peer review.


Transforming Science with Large Language Models: A Survey on AI-assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation

Eger, Steffen, Cao, Yong, D'Souza, Jennifer, Geiger, Andreas, Greisinger, Christian, Gross, Stephanie, Hou, Yufang, Krenn, Brigitte, Lauscher, Anne, Li, Yizhi, Lin, Chenghua, Moosavi, Nafise Sadat, Zhao, Wei, Miller, Tristan

arXiv.org Artificial Intelligence

With the advent of large multimodal language models, science is now at the threshold of an AI-based technological transformation. Recently, a plethora of new AI models and tools has been proposed, promising to empower researchers and academics worldwide to conduct their research more effectively and efficiently. This includes all aspects of the research cycle, especially (1) searching for relevant literature; (2) generating research ideas and conducting experimentation; generating (3) text-based and (4) multimodal content (e.g., scientific figures and diagrams); and (5) AI-based automatic peer review. In this survey, we provide an in-depth overview of these exciting recent developments, which promise to fundamentally alter the scientific research process for good. Our survey covers the five aspects outlined above, indicating relevant datasets, methods and results (including evaluation) as well as limitations and scope for future research. Ethical concerns regarding shortcomings of these tools and potential for misuse (fake science, plagiarism, harms to research integrity) take a particularly prominent place in our discussion. We hope that our survey will not only become a reference guide for newcomers to the field but also a catalyst for new AI-based initiatives in the area of "AI4Science".


Reviews: On Testing for Biases in Peer Review

Neural Information Processing Systems

Thank you to the authors for the detailed response. It addresses most of my concerns. I hope the authors do include a discussion of effect sizes as they suggest in the response, since effect sizes are perhaps the most important thing to assess for a problem like this. I now see I misunderstood the importance of the assignment mechanism's confound in experimental design, compared to simple random assignment analysis; thank you for that clarification. The paper would still be strengthened if it related the problem to how it's addressed in the causal inference and experimental design literature, but the work is still a worthwhile contribution on its own.


Review for NeurIPS paper: Mitigating Manipulation in Peer Review via Randomized Reviewer Assignments

Neural Information Processing Systems

Summary and Contributions: This paper aims to improve the reviewer-paper matching algorithms that many computer science conferences use to assign reviewers to submitted papers. Most conferences currently employ a deterministic algorithm with a linear program at its core that maximizes the total match quality (the sum of similarity scores) subject to load-balancing constraints ensuring that no reviewer is assigned too many papers and every paper is assigned enough reviewers. One problem with a deterministic algorithm is that unethical reviewers can manipulate their similarity scores (either through bids or submitted features) to get assigned to a particular paper in order to boost or nuke it. Another problem is that a deterministic algorithm cannot be shared with the public without the public being able to reverse-engineer the match and reveal the reviewers assigned to a paper. The authors show that both problems can be alleviated by adopting a randomized algorithm.


Ontology-Enhanced Educational Annotation Activities

Gayoso-Cabada, Joaquín, Goicoechea-de-Jorge, María, Gómez-Albarrán, Mercedes, Sanz-Cabrerizo, Amelia, Sarasa-Cabezuelo, Antonio, Sierra, José-Luis

arXiv.org Artificial Intelligence

Information and communications technology and technology-enhanced learning have unquestionably transformed traditional teaching-learning processes and are positioned as key factors to promote quality education, one of the basic sustainable development goals of the 2030 agenda. Document annotation, which was traditionally carried out with pencil and paper and currently benefits from digital document annotation tools, is a representative example of this transformation. Using document annotation tools, students can enrich documents with annotations that highlight their most relevant aspects. As the conceptual complexity of the learning domain increases, the annotation of the documents may require comprehensive domain knowledge and an expert analysis capability that students usually lack. Consequently, a proliferation of irrelevant, incorrect, and/or poorly contextualized annotations may appear, while other relevant aspects are completely ignored by the students. The main hypothesis proposed by this paper is that the use of a guiding annotation ontology in the annotation activities is a keystone aspect for alleviating these shortcomings. Consequently, comprehension is improved, exhaustive content analysis is promoted, and meta-reflective thinking is developed. To test this hypothesis, we describe our own annotation tool, @note, which fully implements this ontology-enhanced annotation paradigm, and we provide experimental evidence about how @note can improve academic performance via a pilot study concerning critical literary annotation.


Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models

Zhuo, Zhijian, Wang, Ya, Zeng, Yutao, Li, Xiaoqing, Zhou, Xun, Ma, Jinwen

arXiv.org Artificial Intelligence

Transformers have found extensive applications across various domains due to their powerful fitting capabilities. This success can be partially attributed to their inherent nonlinearity. Thus, in addition to the ReLU function employed in the original transformer architecture, researchers have explored alternative modules such as GeLU and SwishGLU to enhance nonlinearity and thereby augment representational capacity. In this paper, we propose a novel category of polynomial composition activations (PolyCom), designed to optimize the dynamics of transformers. Theoretically, we provide a comprehensive mathematical analysis of PolyCom, highlighting its enhanced expressivity and efficacy relative to other activation functions. Notably, we demonstrate that networks incorporating PolyCom achieve the $\textbf{optimal approximation rate}$, indicating that PolyCom networks require minimal parameters to approximate general smooth functions in Sobolev spaces. We conduct empirical experiments on the pre-training configurations of large language models (LLMs), including both dense and sparse architectures. By substituting conventional activation functions with PolyCom, we enable LLMs to capture higher-order interactions within the data, thus improving performance metrics in terms of accuracy and convergence rates. Extensive experimental results demonstrate the effectiveness of our method, showing substantial improvements over other activation functions. Code is available at https://github.com/BryceZhuo/PolyCom.
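To make the "polynomial composition" idea concrete, here is a minimal NumPy sketch in the spirit of PolyCom: a weighted sum of integer powers of a base activation (ReLU here). In the actual architecture the coefficients are trainable per-layer parameters; the fixed coefficients below are purely illustrative, and this is not the repository's implementation.

```python
import numpy as np

def poly_relu(x, a):
    """Polynomial composition of ReLU: sum_i a[i] * ReLU(x)**i."""
    r = np.maximum(x, 0.0)
    return sum(ai * r**i for i, ai in enumerate(a))

# Degree-2 example with coefficients a_0=0, a_1=1, a_2=0.5.
y = poly_relu(np.array([-1.0, 0.5, 2.0]), a=[0.0, 1.0, 0.5])
```

The higher-order terms are what let the network represent products of features within a single activation, which is the source of the expressivity gains the paper analyzes.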


Mitigating Manipulation in Peer Review via Randomized Reviewer Assignments

Neural Information Processing Systems

We consider three important challenges in conference peer review: (i) reviewers maliciously attempting to get assigned to certain papers to provide positive reviews, possibly as part of quid-pro-quo arrangements with the authors; (ii) "torpedo reviewing," where reviewers deliberately attempt to get assigned to certain papers that they dislike in order to reject them; (iii) reviewer de-anonymization on release of the similarities and the reviewer-assignment code. On the conceptual front, we identify connections between these three problems and present a framework that brings all these challenges under a common umbrella. We then present a (randomized) algorithm for reviewer assignment that can optimally solve the reviewer-assignment problem under any given constraints on the probability of assignment for any reviewer-paper pair. We further consider the problem of restricting the joint probability that certain suspect pairs of reviewers are assigned to certain papers, and show that this problem is NP-hard for arbitrary constraints on these joint probabilities but efficiently solvable for a practical special case. Finally, we experimentally evaluate our algorithms on datasets from past conferences, where we observe that they can limit the chance that any malicious reviewer gets assigned to their desired paper to 50% while producing assignments with over 90% of the total optimal similarity.
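The core of the construction above, a randomized assignment whose per-pair probability is capped, can be sketched as a fractional linear program: maximize total similarity subject to the cap, reviewer load limits, and per-paper demand. This sketch covers only the fractional stage; the paper's full algorithm also samples an integral assignment consistent with these marginals, which is omitted here, and the instance sizes and cap value below are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

def fractional_assignment(sim, q, reviewer_load, paper_demand):
    """LP: maximize total similarity with every pair probability capped at q."""
    R, P = sim.shape
    c = -sim.ravel()  # linprog minimizes, so negate similarities
    # Each paper p must receive exactly `paper_demand` total probability mass.
    A_eq = np.zeros((P, R * P))
    for p in range(P):
        A_eq[p, p::P] = 1.0  # x is flattened row-major: index r*P + p
    b_eq = np.full(P, float(paper_demand))
    # Each reviewer r carries at most `reviewer_load` expected papers.
    A_ub = np.zeros((R, R * P))
    for r in range(R):
        A_ub[r, r * P:(r + 1) * P] = 1.0
    b_ub = np.full(R, float(reviewer_load))
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0.0, q)] * (R * P))
    return res.x.reshape(R, P)

# 2 reviewers, 2 papers, cap q=0.5: no reviewer can be guaranteed any paper.
sim = np.array([[0.9, 0.1],
                [0.2, 0.8]])
X = fractional_assignment(sim, q=0.5, reviewer_load=1, paper_demand=1)
```

In this tiny instance the cap forces every reviewer-paper probability to exactly 0.5, mirroring the paper's headline guarantee that no malicious reviewer can secure their target paper with more than 50% probability.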